ENH: Read from compressed data sources #11677
@@ -234,18 +234,28 @@ class ParserWarning(Warning):
    fields if it is not spaces (e.g., '~').
""" % (_parser_params % _fwf_widths)


def get_compression(filepath_or_buffer, encoding, compression_kwd):
    """
    Determine the compression type of a file or buffer.
Move this entire function to pandas.io.common; maybe call it get_compression_type.
This should be a bit more lightweight; as you can see, there are already other functions that use the compression. It will simply infer the type of compression from a keyword and/or file extension and return it, so it can then be passed to other routines.
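A minimal sketch of the lightweight helper this comment asks for, under stated assumptions: the name get_compression_type comes from the comment, but the signature, the extension table, and the treatment of an 'infer' keyword are illustrative guesses, not the code that ended up in pandas.io.common.

```python
import os

# Hypothetical extension table; which formats are supported is an assumption.
_EXTENSION_TO_COMPRESSION = {
    ".gz": "gzip",
    ".bz2": "bz2",
}


def get_compression_type(filepath_or_buffer, compression_kwd=None):
    """Return the name of the compression to use, or None for no compression.

    An explicit keyword wins. Otherwise the type is inferred from the file
    extension when a path string is given. Buffers carry no extension, so
    nothing can be inferred for them.
    """
    # Assumption: None and 'infer' both mean "work it out from the path".
    if compression_kwd is not None and compression_kwd != "infer":
        valid = set(_EXTENSION_TO_COMPRESSION.values())
        if compression_kwd not in valid:
            raise ValueError(
                "Unrecognized compression type: %r" % (compression_kwd,)
            )
        return compression_kwd

    if isinstance(filepath_or_buffer, str):
        extension = os.path.splitext(filepath_or_buffer)[1].lower()
        return _EXTENSION_TO_COMPRESSION.get(extension)

    return None
```

Because the helper only returns a string (or None) and never opens the file, each read_* routine can decide for itself how to act on the result.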
@@ -31,7 +31,7 @@ New features
Other enhancements
^^^^^^^^^^^^^^^^^^

- `read_pickle` can now unpickle from compressed files (:issue:`<num>`).
add in the actual number here (11666)
needs some tests
can you add some tests and update
@khs26 can you update
@khs26 can you update
closing but pls reopen if you wish to continue working
This is the start of an effort to address compression in read_* methods (at the moment, just read_pickle), as per #11666. So far, I've factored out the compression handling in the _read function of pandas.io.parsers and used that in the pickling read routine.
Tests are on the way, along with actual testing that it works and docs, and (if we want) I can also put something similar in the other read_* methods too.
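For illustration only, the idea of unpickling from a compressed file can be sketched with the standard library alone. The function name read_pickle_compressed and its compression argument are hypothetical. This is not the code in this PR, and it does not touch pandas internals such as _read.

```python
import bz2
import gzip
import pickle

# Map a compression name to a callable that opens the file for reading.
# The set of supported names here is an assumption for the sketch.
_OPENERS = {
    None: open,
    "gzip": gzip.open,
    "bz2": bz2.open,
}


def read_pickle_compressed(path, compression=None):
    """Unpickle an object from `path`, decompressing on the fly if asked."""
    try:
        opener = _OPENERS[compression]
    except KeyError:
        raise ValueError("Unrecognized compression type: %r" % (compression,))

    # gzip.open and bz2.open accept the same "rb" mode as the builtin open
    # and return file-like objects that pickle.load can read from.
    with opener(path, "rb") as handle:
        return pickle.load(handle)
```

The compression name would typically come from a separate inference step (keyword and/or file extension), which keeps the opening logic reusable across other read_* routines.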